
    Robust Fusion of LiDAR and Wide-Angle Camera Data for Autonomous Mobile Robots

    Autonomous robots that assist humans in day-to-day living tasks are becoming increasingly popular. Autonomous mobile robots operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors, such as LiDAR, radar, ultrasound sensors, and cameras, is utilized to sense the surrounding environment of autonomous vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be positively utilized for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams differ from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to utilize the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically, and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor for free-space detection. The outputs of the LiDAR scanner and the image sensor are of different spatial resolutions and need to be aligned with each other. A geometrical model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression-based resolution matching algorithm to interpolate the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of an uncertainty-aware free-space detection algorithm.
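
    The GP regression step described above can be sketched in a few lines with scikit-learn. The RBF-plus-noise kernel, synthetic projected LiDAR depths, and grid resolution below are illustrative assumptions, not the authors' exact configuration:

        # Sketch of GP-based resolution matching: interpolate sparse LiDAR
        # depths onto a dense image grid with per-pixel uncertainty.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(0)

        # Sparse LiDAR returns projected into the image plane: (u, v) -> depth
        uv_sparse = rng.uniform(0, 100, size=(200, 2))            # pixel coords
        depth_sparse = 10 + 0.05 * uv_sparse[:, 0] + rng.normal(0, 0.1, 200)

        # GP prior: smooth depth surface plus observation noise
        kernel = RBF(length_scale=10.0) + WhiteKernel(noise_level=0.01)
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        gp.fit(uv_sparse, depth_sparse)

        # Dense query grid at (coarse) camera resolution, with uncertainty
        uu, vv = np.meshgrid(np.arange(0, 100, 2), np.arange(0, 100, 2))
        uv_dense = np.column_stack([uu.ravel(), vv.ravel()])
        depth_mean, depth_std = gp.predict(uv_dense, return_std=True)

        # depth_std is the quantifiable uncertainty that downstream
        # free-space detection can use to weight or reject interpolated points.
        print(depth_mean.shape, depth_std.max())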

    Passing Heatmap Prediction Based on Transformer Model and Tracking Data

    Although data-driven analysis of football players' performance has been developed for years, most research focuses only on on-ball events such as shots and passes, while off-ball movement remains a little-explored area in this domain. Players' contributions to the whole match are evaluated unfairly: those who have more chances to score goals earn more credit than others, while the indirect and unnoticeable impact of continuous movement is ignored. This research presents a novel deep-learning network architecture capable of predicting the potential end location of passes and how players' movement before the pass affects the final outcome. After analysing more than 28,000 pass events, a robust prediction can be achieved with more than 0.7 Top-1 accuracy. Based on this prediction, a better understanding of pitch control and pass options can be reached to measure players' off-ball movement contribution to defensive performance. Moreover, this model could provide football analysts with a better tool and metric to understand how players' movement over time contributes to the game strategy and final victory.
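
    As a rough illustration of such an architecture, the sketch below maps per-entity tracking features through a small Transformer encoder to a heatmap over pitch zones; the feature layout, grid size, and model dimensions are assumptions, not the paper's design:

        # Sketch: tracking data for 22 players + ball -> heatmap over pitch
        # zones for the predicted pass end location.
        import torch
        import torch.nn as nn

        class PassHeatmapNet(nn.Module):
            def __init__(self, feat_dim=6, d_model=64, n_zones=(16, 12)):
                super().__init__()
                self.embed = nn.Linear(feat_dim, d_model)   # per-entity features
                enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4,
                                                       batch_first=True)
                self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
                self.head = nn.Linear(d_model, n_zones[0] * n_zones[1])
                self.n_zones = n_zones

            def forward(self, x):                 # x: (batch, 23, feat_dim)
                h = self.encoder(self.embed(x))   # entity-to-entity attention
                h = h.mean(dim=1)                 # pool over players + ball
                logits = self.head(h)
                # probability heatmap over pitch zones; Top-1 accuracy compares
                # the argmax zone against the observed pass end zone
                return logits.softmax(-1).view(-1, *self.n_zones)

        # Batch of 8 passes: 23 entities, each with e.g. (x, y, vx, vy, team, ball)
        model = PassHeatmapNet()
        heatmap = model(torch.randn(8, 23, 6))
        print(heatmap.shape)                      # torch.Size([8, 16, 12])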

    Embedding Contextual Information through Reward Shaping in Multi-Agent Learning: A Case Study from Google Football

    Artificial Intelligence has been used to help humans complete difficult tasks in complicated environments by providing optimized strategies for decision-making or by replacing manual labour. In environments with multiple agents, such as football, the most common methods to train agents are Imitation Learning and Multi-Agent Reinforcement Learning (MARL). However, agents trained by Imitation Learning cannot outperform the expert demonstrator, which makes it hard for humans to gain new insights from the learnt policy. Besides, MARL is prone to the credit assignment problem, and in environments with sparse reward signals this method can be inefficient. The objective of our research is to create a novel reward shaping method that embeds contextual information in the reward function to solve the aforementioned challenges. We demonstrate this in the Google Research Football (GRF) environment. We quantify the contextual information extracted from the game-state observation and combine this quantity with the original sparse reward to create the shaped reward. The experimental results in the GRF environment show that our reward shaping method is a useful addition to state-of-the-art MARL algorithms for training agents in environments with sparse reward signals.
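
    A minimal sketch of the shaping idea follows, assuming a simple context score built from the GRF observation's ball position and possession fields; the particular features and weights are illustrative, not the paper's formulation:

        # Sketch: augment the sparse environment reward with a quantified
        # context signal extracted from the game-state observation.
        def context_score(obs):
            """Quantify contextual information from a GRF-style observation dict."""
            ball_x = obs["ball"][0]              # pitch x in [-1, 1], attacking right
            progress = (ball_x + 1.0) / 2.0      # nearer the opponent goal -> higher
            possession = 1.0 if obs["ball_owned_team"] == 0 else 0.0  # 0 = left team
            return 0.5 * progress + 0.5 * possession

        def shaped_reward(sparse_reward, obs, prev_obs, beta=0.1):
            # Difference-of-potentials form (discount omitted) limits how much
            # the shaping term can distort the original objective.
            return sparse_reward + beta * (context_score(obs) - context_score(prev_obs))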

    Deep Multi-Critic Network for accelerating Policy Learning in multi-agent environments

    Humans live among other humans, not in isolation. Therefore, the ability to learn and behave in multi-agent environments is essential for any autonomous system that intends to interact with people. Due to the presence of multiple simultaneous learners in a multi-agent learning environment, the Markov assumption used for single-agent environments is not tenable, necessitating the development of new Policy Learning algorithms. Recent Actor-Critic algorithms proposed for multi-agent environments, such as Multi-Agent Deep Deterministic Policy Gradients and Counterfactual Multi-Agent Policy Gradients, retain the same mathematical framework as single-agent environments by augmenting the Critic with extra information. However, this extra information can slow down the learning process and afflict the Critic with the curse of dimensionality. To combat this, we propose a novel Deep Neural Network configuration called the Deep Multi-Critic Network. This architecture takes a weighted sum over the outputs of multiple critic networks of varying complexity and size. The configuration was tested on data collected from a real-world multi-agent environment. The results illustrate that with the Deep Multi-Critic Network, less data is needed to reach the same level of performance as without it. This suggests that because the configuration learns faster from less data, the Critic may be able to learn Q-values faster, accelerating Actor training as well.
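
    Since the abstract describes the architecture as a weighted sum over critics of varying complexity, a minimal sketch is straightforward; the layer sizes and the learnable softmax weighting below are assumptions:

        # Sketch: several critic networks of varying capacity score the same
        # (state, action) input; a learned weighted sum combines their Q-values.
        import torch
        import torch.nn as nn

        class MultiCriticNetwork(nn.Module):
            def __init__(self, in_dim, hidden_sizes=(16, 64, 256)):
                super().__init__()
                # One critic per capacity level, small (fast) to large (expressive)
                self.critics = nn.ModuleList(
                    nn.Sequential(nn.Linear(in_dim, h), nn.ReLU(), nn.Linear(h, 1))
                    for h in hidden_sizes
                )
                self.weights = nn.Parameter(torch.ones(len(hidden_sizes)))

            def forward(self, state_action):
                qs = torch.cat([c(state_action) for c in self.critics], dim=-1)
                w = self.weights.softmax(-1)            # normalized weights
                return (qs * w).sum(-1, keepdim=True)   # combined Q-value

        critic = MultiCriticNetwork(in_dim=20)
        q = critic(torch.randn(32, 20))   # batch of joint state-action inputs
        print(q.shape)                    # torch.Size([32, 1])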

    Learning control policies of driverless vehicles from UAV video streams in complex urban environments

    The way we drive and the transport of today are going through radical changes. Intelligent mobility envisions improving the efficiency of traditional transportation through advanced digital technologies, such as robotics, artificial intelligence, and the Internet of Things. Central to the development of intelligent mobility technology is the emergence of connected autonomous vehicles (CAVs), where vehicles are capable of navigating environments autonomously. For this to be achieved, autonomous vehicles must be safe and trusted by passengers and other drivers. However, it is practically impossible to train autonomous vehicles on all the possible traffic conditions that they may encounter. The work in this paper presents an alternative solution of using infrastructure to help CAVs learn driving policies, specifically for complex junctions, which require local experience and knowledge to handle. The proposal is to learn safe driving policies through data-driven imitation learning of human-driven vehicles at a junction, utilizing data captured by surveillance devices about vehicle movements at the junction. The proposed framework is demonstrated by processing video datasets, captured by uncrewed aerial vehicles (UAVs) at three intersections around Europe, which contain vehicle trajectories. An imitation learning algorithm based on a long short-term memory (LSTM) neural network is proposed to learn and predict safe trajectories of vehicles. The proposed framework can be used for many purposes in intelligent mobility, such as augmenting the intelligent control algorithms in driverless vehicles, benchmarking driver behavior for insurance purposes, and providing insights for city planning.
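
    A minimal sketch of an LSTM trajectory learner of this kind follows, assuming fixed-length (x, y) observation windows as input; the window length and layer sizes are illustrative, not the paper's settings:

        # Sketch: given a window of observed (x, y) positions from a
        # UAV-tracked vehicle, predict the next position (imitation target).
        import torch
        import torch.nn as nn

        class TrajectoryLSTM(nn.Module):
            def __init__(self, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(input_size=2, hidden_size=hidden,
                                    batch_first=True)
                self.out = nn.Linear(hidden, 2)   # next (x, y) position

            def forward(self, xy_window):         # (batch, T, 2)
                h, _ = self.lstm(xy_window)
                return self.out(h[:, -1])         # predict from last timestep

        model = TrajectoryLSTM()
        pred = model(torch.randn(16, 25, 2))      # 16 tracks, 25 observed steps
        # Imitation loss against the human-driven next position
        loss = nn.functional.mse_loss(pred, torch.randn(16, 2))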

    Use of Machine Learning to Automate the Identification of Basketball Strategies Using Whole Team Player Tracking Data

    The use of machine learning to identify and classify offensive and defensive strategies in team sports through spatio-temporal tracking data has recently received significant interest in the literature and the global sport industry. This paper focuses on data-driven defensive strategy learning in basketball. Most research to date on basketball strategy learning has focused on offensive effectiveness and is based on the interaction between the on-ball player and the principal on-ball defender, thereby ignoring the contribution of the remaining players. Furthermore, most sports analytics systems that provide play-by-play data are heavily biased towards offensive metrics such as passes, dribbles, and shots. The aim of the current study was to use machine learning to classify the different defensive strategies basketball players adopt when deviating from their initial defensive action. An analytical model was developed to recognise the one-on-one (matched) relationships of the players, which is utilised to automatically identify any change of defensive strategy. A classification model was developed, based on a player and ball tracking dataset from National Basketball Association (NBA) game play, to classify the adopted defensive strategy against pick-and-roll play. The methodology described is the first to analyse the defensive strategy of all in-game players (both on-ball and off-ball). The cross-validation results indicate that the proposed technique for automatic defensive strategy identification can achieve up to 69% classification accuracy. Machine learning techniques such as the one adopted here have the potential to enable a deeper understanding of player decision-making and defensive game strategies in basketball and other sports by leveraging player and ball tracking data.
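
    One plausible way to recognise one-on-one matchups from tracking data, shown below purely for illustration, is minimum-cost assignment on defender-attacker distances (the Hungarian algorithm); the paper's actual matching model is not specified here, so this is a stand-in:

        # Sketch: assign each defender to the attacker they most plausibly
        # guard by minimising total distance over the frame.
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        defenders = np.random.rand(5, 2) * [94, 50]   # (x, y) on an NBA court, feet
        attackers = np.random.rand(5, 2) * [94, 50]

        # Cost matrix: distance from every defender to every attacker
        cost = np.linalg.norm(defenders[:, None, :] - attackers[None, :, :],
                              axis=-1)
        d_idx, a_idx = linear_sum_assignment(cost)

        # A switch or help rotation appears as a change in this assignment
        # between consecutive frames of the tracking data.
        for d, a in zip(d_idx, a_idx):
            print(f"defender {d} -> attacker {a} ({cost[d, a]:.1f} ft)")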

    An improved model of binocular energy calculation for full-reference stereoscopic image quality assessment

    With the exponential growth of stereoscopic imaging in various applications, a reliable quality assessment technique that measures the human perception of stereoscopic images is in high demand. Quality assessment of stereoscopic visual content in the presence of artefacts caused by compression and transmission is a key component of end-to-end 3D media delivery systems. Despite a few recent attempts to develop stereoscopic image/video quality metrics, a robust stereoscopic image quality metric is still lacking. Towards addressing this issue, this paper proposes a full-reference stereoscopic image quality metric that mimics human perception while viewing stereoscopic images. A signal processing model consistent with the physiological literature is developed to simulate the behaviour of simple and complex cells of the primary visual cortex in the Human Visual System (HVS). The model is trained on two publicly available stereoscopic image databases to match the perceptual judgement of impaired stereoscopic images. The experimental results demonstrate a significant improvement in prediction performance compared with several state-of-the-art stereoscopic image quality metrics.
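
    For illustration, the classical binocular energy computation that such simple/complex-cell models build on can be sketched with quadrature-pair Gabor filters; the filter parameters below are assumptions, not the paper's tuned model:

        # Sketch: simple cells as quadrature-pair Gabor filters applied to the
        # left and right views; a complex cell sums the squared binocular sums.
        import numpy as np
        from scipy.signal import fftconvolve

        def gabor(size=21, freq=0.15, theta=0.0, phase=0.0, sigma=4.0):
            half = size // 2
            y, x = np.mgrid[-half:half + 1, -half:half + 1]
            xr = x * np.cos(theta) + y * np.sin(theta)
            env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
            return env * np.cos(2 * np.pi * freq * xr + phase)

        def binocular_energy(left, right):
            g_even = gabor(phase=0.0)
            g_odd = gabor(phase=np.pi / 2)        # quadrature pair
            # Simple-cell responses for each eye
            l_even = fftconvolve(left, g_even, mode="same")
            l_odd = fftconvolve(left, g_odd, mode="same")
            r_even = fftconvolve(right, g_even, mode="same")
            r_odd = fftconvolve(right, g_odd, mode="same")
            # Complex-cell (binocular energy) response
            return (l_even + r_even) ** 2 + (l_odd + r_odd) ** 2

        rng = np.random.default_rng(1)
        left, right = rng.random((64, 64)), rng.random((64, 64))
        print(binocular_energy(left, right).mean())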

    A full-reference stereoscopic image quality metric based on binocular energy and regression analysis

    The recent developments of 3D media technology have brought to life numerous applications of interactive entertainment such as 3D cinema, 3DTV, and gaming. However, due to the data-intensive nature of 3D visual content, a number of research challenges have emerged. In order to optimise the end-to-end content life-cycle, from capture to processing and delivery, Quality of Experience (QoE) has become a major driving factor. This paper presents a human-centric approach to quality estimation of 3D visual content: a full-reference quality assessment method for stereoscopic images. It is based on a Human Visual System (HVS) model that estimates subjective scores of registered stereoscopic images subjected to compression losses. The model has been trained on four publicly available registered stereoscopic image databases, and a fixed relationship between subjective scores and the model output has been determined. The high correlation of this relationship over a large number of stimuli demonstrates its consistency compared with the state of the art.
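
    The regression step can be illustrated with a logistic mapping from an energy-based feature to subjective scores, a common choice in full-reference quality assessment; the logistic form and the synthetic data below are assumptions, not the paper's exact analysis:

        # Sketch: fit a logistic curve mapping an energy-difference feature
        # (reference vs. distorted stereo pair) to mean opinion scores.
        import numpy as np
        from scipy.optimize import curve_fit

        def logistic(x, a, b, c, d):
            return a / (1.0 + np.exp(-b * (x - c))) + d

        # Per-pair feature vs. subjective score (synthetic stand-in data)
        rng = np.random.default_rng(2)
        feature = rng.uniform(0, 1, 60)
        mos = logistic(feature, 4.0, 6.0, 0.5, 1.0) + rng.normal(0, 0.1, 60)

        params, _ = curve_fit(logistic, feature, mos, p0=[4.0, 5.0, 0.5, 1.0])
        predicted = logistic(feature, *params)
        # Correlation with subjective scores is the usual performance measure
        print(np.corrcoef(predicted, mos)[0, 1])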